3.1  INTRODUCTION


The chi-square test, invented by Karl Pearson in 1900, is not only the
oldest test of fit, but the oldest non-trivial test of significance. While
it is inferior in power to other classes of tests of fit, the Pearson test is
unexcelled in ease and flexibility of use. It applies with little modification
to the problems of testing fit to parametric families of distributions, to
discrete distributions, and to multivariate distributions. Recent variations
of the Pearson statistic have improved the flexibility of chi-square
techniques, especially when unknown parameters must be estimated in the
hypothesized family. This chapter focuses on those variations of the chi-square
test which appear most useful to practitioners, with briefer comments and
references for other aspects of the subject. Numerical examples are given in
Section 3.2.4 for the Pearson statistic and in Section 3.3.3 for some newer
chi-square statistics. In addition, Section 3.4.2 illustrates the use of
chi-square techniques in the less common situations of multivariate observations
and censored data. Recommendations on the use of chi-square techniques in
practice appear in Sections 3.2.5 and 3.4.1.


•Preparation  of  this  chapter  was  supported  in  part  by  the  Air  Force  Office 
of  Scientific  Research  under  Grant  AFOSR  77-3291.  The  author  is  grateful 
to  Dr.  Daniel  Mihalko  for  his  assistance. 
3.2  THE PEARSON CHI-SQUARE STATISTIC
3.2.1  Simple Hypothesis

To test the simple hypothesis that a random sample $X_1, \ldots, X_n$ has the
distribution function $F(x)$, Pearson partitioned the range of $X_j$ into $M$ cells,
say $E_1, \ldots, E_M$. If $N_1, \ldots, N_M$ are the observed numbers of $X_j$'s in these cells,
then $N_i$ has the binomial distribution with parameters $n$ and

$$p_i = P(X_j \text{ falls in } E_i) = \int_{E_i} dF(x) \qquad (3.1)$$

when the null hypothesis is true. Pearson reasoned that the differences
$N_i - np_i$ between observed and expected cell frequencies express lack of fit
of the data to $F$, and he sought an appropriate function of these differences
for use as a measure of fit.

Pearson's argument here was in three stages: (i) The quantities $N_i - np_i$
have in large samples approximately a multivariate normal distribution,
and this distribution is nonsingular if only $M-1$ of the cells are considered.

(ii) If $Y = (Y_1, \ldots, Y_p)'$ has a nonsingular p-variate normal distribution,
then the quadratic form $(Y-\mu)'\Sigma^{-1}(Y-\mu)$ appearing in the exponent of
the density function has the $\chi^2(p)$ distribution as a function of $Y$. Here of
course $\mu$ is the p-vector of means, and $\Sigma$ is the $p \times p$ covariance matrix of $Y$.

(iii) Computation shows that if $Y = (N_1 - np_1, \ldots, N_{M-1} - np_{M-1})'$, this quadratic
form is

$$X^2 = \sum_{i=1}^{M} \frac{(N_i - np_i)^2}{np_i}$$

which therefore has approximately the $\chi^2(M-1)$ null distribution in large
samples. This is the Pearson chi-square statistic.
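The statistic in stage (iii) is simple to compute directly. The following is a minimal sketch, not part of the original text: the function name `pearson_chi_square` and the counts are hypothetical, chosen only to show the arithmetic.

```python
def pearson_chi_square(observed, probs):
    """Return X^2 = sum_i (N_i - n p_i)^2 / (n p_i) for cell counts N_i."""
    if len(observed) != len(probs):
        raise ValueError("need one probability per cell")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("cell probabilities must sum to 1")
    n = sum(observed)
    return sum((N - n * p) ** 2 / (n * p) for N, p in zip(observed, probs))


# Hypothetical counts for M = 4 equiprobable cells and n = 40 observations.
counts = [8, 12, 9, 11]
x2 = pearson_chi_square(counts, [0.25] * 4)
print(x2)  # compare with chi-square(M - 1) = chi-square(3) critical points
```

Each expected frequency here is 10, so the statistic is (4 + 4 + 1 + 1)/10 = 1.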


This elegant argument will reappear in our survey of recent advances
in chi-square tests. Pearson reduced the problem of testing fit to the
problem of testing whether a multinomial distribution has cell probabilities
$p_i$ given by (3.1). This problem, and the statistic $X^2$, do not depend on whether $F$
is univariate or multivariate, discrete or continuous. But if $F$ is continuous,
consideration of only the cell frequencies $N_i$ does not fully use the information
available in the observations $X_j$. Thus the flexibility and relative lack of power of
$X^2$ stem from the same source.

3.2.2  Composite  Hypothesis 

It is common to wish to test the composite hypothesis that the distribution
function of the observations $X_j$ is a member of a parametric family $\{F(\cdot|\theta): \theta \text{ in } \Omega\}$,
where $\Omega$ is a p-dimensional parameter space. Pearson recommended estimating $\theta$ by an
estimator $\tilde\theta_n$ (a function of $X_1, \ldots, X_n$), and testing fit to the distribution $F(\cdot|\tilde\theta_n)$.
Thus the estimated cell probabilities become

$$p_i(\tilde\theta_n) = \int_{E_i} dF(x|\tilde\theta_n)$$

and the Pearson statistic is

$$X^2(\tilde\theta_n) = \sum_{i=1}^{M} \frac{[N_i - np_i(\tilde\theta_n)]^2}{np_i(\tilde\theta_n)}.$$

Pearson did not think that estimating $\theta$ changes the large sample distribution of
$X^2$, at least when $\tilde\theta_n$ is consistent. In this he was wrong. It was not until
1924 that Fisher showed that the limiting null distribution of $X^2(\tilde\theta_n)$ is not
$\chi^2(M-1)$, and that this distribution depends on the method of estimation used.

Fisher argued that the appropriate method of estimation is maximum
likelihood estimation based on the cell frequencies $N_i$. This grouped
data MLE is the solution of the equations

$$\sum_{i=1}^{M} \frac{N_i}{p_i(\theta)} \frac{\partial p_i(\theta)}{\partial \theta_k} = 0, \qquad k = 1, \ldots, p \qquad (3.2)$$

obtained by differentiating the logarithm of the multinomial likelihood
function. Fisher showed further that an asymptotically equivalent estimator
can be obtained by choosing $\theta$ to minimize $X^2(\theta)$ for the observed $N_i$. This
minimum chi-square estimator is the solution of

$$\sum_{i=1}^{M} \left[\frac{N_i}{p_i(\theta)}\right]^2 \frac{\partial p_i(\theta)}{\partial \theta_k} = 0, \qquad k = 1, \ldots, p. \qquad (3.3)$$


Let us denote either estimator by $\bar\theta_n$. Then $X^2(\bar\theta_n)$ is conceptually the
Pearson statistic for testing fit to $F(\cdot|\bar\theta_n)$, the member of the family
$\{F(\cdot|\theta)\}$ which is closest to the data if the Pearson statistic is used as a
measure of distance. Fisher showed that the Pearson-Fisher statistic
$X^2(\bar\theta_n)$ has the $\chi^2(M-p-1)$ limiting distribution under the null hypothesis, no matter
what $\theta$ in $\Omega$ is the true value. This is the famous "lose one degree of freedom for each
parameter estimated" result.

Neyman (1949) noted that another estimator asymptotically equivalent
to $\bar\theta_n$ can be obtained by minimizing the modified chi-square statistic

$$\sum_{i=1}^{M} \frac{[N_i - np_i(\theta)]^2}{N_i}.$$

This minimum modified chi-square estimator is the solution of

$$\sum_{i=1}^{M} \frac{p_i(\theta)}{N_i} \frac{\partial p_i(\theta)}{\partial \theta_k} = 0, \qquad k = 1, \ldots, p. \qquad (3.4)$$

Since for the purposes of large sample theory this estimator is interchangeable
with the previous two, call it also $\bar\theta_n$ to minimize notation. Neyman's remark
is important because equations (3.4) are more often solvable in closed form
than are (3.3) and (3.2).

EXAMPLE. Consider the chi-square test of fit to the family of density
functions

$$f(x|\theta) = \tfrac{1}{2}(1 + \theta x), \qquad -1 \le x \le 1 \qquad (3.5)$$

with $\Omega = (-1,1)$. This family has been used as a model for the distribution of
the cosine of the scattering angle in some beam-scattering experiments
in physics. For cells $E_i = (a_{i-1}, a_i]$ with

$$-1 = a_0 < a_1 < \cdots < a_M = 1,$$

we have

$$p_i(\theta) = \int_{a_{i-1}}^{a_i} f(x|\theta)\,dx = \tfrac{1}{2}(a_i - a_{i-1}) + \tfrac{\theta}{4}(a_i^2 - a_{i-1}^2).$$

It is easily seen that neither (3.2) nor (3.3) has a closed solution, while
(3.4) has solution

$$\bar\theta_n = -2\,\frac{\sum_{i=1}^{M} (a_i - a_{i-1})(a_i^2 - a_{i-1}^2)/N_i}{\sum_{i=1}^{M} (a_i^2 - a_{i-1}^2)^2/N_i}.$$

Substituting this value in the Pearson statistic produces an easily computed
test of fit for the family (3.5) using $\chi^2(M-2)$ critical points.
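The closed-form solution of (3.4) for the family (3.5) can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's code: it takes the density as f(x|θ) = (1 + θx)/2 on [-1, 1], the function names are hypothetical, and the cell counts are invented. Every N_i must be positive, and the result is not automatically confined to (-1, 1).

```python
def cell_probs(theta, a):
    """p_i(theta) for f(x|theta) = (1 + theta*x)/2 on cells (a[i-1], a[i]]."""
    return [0.5 * (a[i] - a[i - 1]) + 0.25 * theta * (a[i] ** 2 - a[i - 1] ** 2)
            for i in range(1, len(a))]


def min_modified_chi_square(counts, a):
    """Closed-form root of (3.4) for the family (3.5); needs every N_i > 0."""
    num = den = 0.0
    for i, N in enumerate(counts, start=1):
        width = a[i] - a[i - 1]            # a_i - a_{i-1}
        dsq = a[i] ** 2 - a[i - 1] ** 2    # a_i^2 - a_{i-1}^2
        num += width * dsq / N
        den += dsq ** 2 / N
    return -2.0 * num / den


# Hypothetical boundaries and counts for M = 4 cells of equal width.
a = [-1.0, -0.5, 0.0, 0.5, 1.0]
counts = [15, 22, 28, 35]
theta = min_modified_chi_square(counts, a)
probs = cell_probs(theta, a)
n = sum(counts)
x2 = sum((N - n * p) ** 2 / (n * p) for N, p in zip(counts, probs))
print(theta, x2)  # x2 is referred to chi-square(M - 2) critical points
```

Because only sums over the cells are needed, the estimator costs no more to compute than the statistic itself.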

But even the minimum modified chi-square estimator must often be
obtained by numerical solution of its defining equation. If cells
$E_i = (a_{i-1}, a_i]$ are used in a chi-square test of fit to the normal family

$$F(x|\mu,\sigma) = \Phi\!\left(\frac{x-\mu}{\sigma}\right), \qquad -\infty < x < \infty,$$

($\Phi$ is the standard normal distribution function), then

$$p_i(\mu,\sigma) = \Phi\!\left(\frac{a_i-\mu}{\sigma}\right) - \Phi\!\left(\frac{a_{i-1}-\mu}{\sigma}\right).$$

It takes only a moment to see that none of the three versions of $\bar\theta_n$ can be
obtained algebraically, so that recourse to numerical solution is required. Most
computer libraries contain efficient routines using (for example) Newton's method
to accomplish the solution.

This circumstance calls to mind Fisher's warning that his "lose
one degree of freedom for each parameter estimated" result is not true when
estimators not asymptotically the same as $\bar\theta_n$ are used. For example, in
testing univariate normality we may not simply use the raw data MLE's

$$\bar X = \frac{1}{n}\sum_{j=1}^{n} X_j, \qquad \hat\sigma = \left[\frac{1}{n}\sum_{j=1}^{n} (X_j - \bar X)^2\right]^{1/2}$$

in the Pearson statistic. Chernoff and Lehmann (1954) studied the consequences
of using the raw data MLE $\hat\theta_n$ in the Pearson statistic. They found that
$X^2(\hat\theta_n)$ has as its limiting distribution under $F(\cdot|\theta)$ the distribution of

$$\chi^2(M-p-1) + \sum_{k=1}^{p} \lambda_k(\theta)\,\chi_k^2(1). \qquad (3.6)$$

Here $\chi^2(M-p-1)$ and $\chi_k^2(1)$ are independent chi-square random variables with the
indicated numbers of degrees of freedom. The numbers $\lambda_k(\theta)$ satisfy
$0 \le \lambda_k(\theta) < 1$. So the large sample distribution of $X^2(\hat\theta_n)$ is not $\chi^2$ and depends
on the true value of $\theta$. All that can be said in general is that the correct
critical points fall between those of $\chi^2(M-p-1)$ and those of $\chi^2(M-1)$.
These bounds often make $X^2(\hat\theta_n)$ usable in practice, especially when the number
of cells $M$ is large and the number of parameters $p$ is small.
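The bounds on the critical points translate directly into bounds on the P-value. The sketch below is an illustration, not part of the original text. The helper `chi2_sf` is a hypothetical stdlib-only routine that evaluates the chi-square upper tail by the standard power series for the incomplete gamma function. It is adequate for moderate arguments but loses relative accuracy far out in the tail. The inputs M = 25, p = 2 and X² = 22 are the values reported for the NOR data in Section 3.2.4.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(chi-square(df) > x), via the incomplete gamma series."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # First term of P(a, z) = z^a e^{-z} * sum_k z^k / Gamma(a + k + 1).
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    lower, k = term, 1
    while term > 1e-16 * lower and k < 10000:
        term *= z / (a + k)
        lower += term
        k += 1
    return max(0.0, 1.0 - lower)


def p_value_bounds(x2, M, p):
    """Bounds on the P-value of X^2 with raw data MLEs, from (3.6)."""
    return chi2_sf(x2, M - p - 1), chi2_sf(x2, M - 1)


low, high = p_value_bounds(22.0, M=25, p=2)
print(round(low, 3), round(high, 3))
```

When the two bounds fall on the same side of the chosen significance level, the unknown weights in (3.6) do not affect the conclusion.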

3.2.3  Choosing Cells In The Pearson Statistic

A major objection to the use of chi-square tests has been the arbitrariness
introduced by the necessity to choose cells. This choice is guided by two
considerations: the power of the resulting test, and the desire to use the
asymptotic distribution of $X^2$ as an approximation to the exact distribution for
sample size $n$. These issues have been studied in detail for the case of a
simple hypothesis, i.e., the case of testing fit to a completely specified
distribution $F$. Recommendations can be made in this case which may reasonably
be extended to the case of testing fit to a parametric family $\{F(\cdot|\theta)\}$.


Mann and Wald (1942) initiated the study of the choice of cells in the
Pearson test of fit to a continuous distribution $F$. They recommended, first,
that the cells be chosen to have equal probabilities under the hypothesized
distribution $F$. The advantages of such a choice are: (1) The distance
$\sup_x |F_1(x) - F(x)|$ to the nearest alternative $F_1$ indistinguishable from $F$ by $X^2$ is
maximized. (2) The chi-square test is unbiased. (Mann and Wald proved only
local unbiasedness, but the test is in fact unbiased against arbitrary alternatives
$F_1$. This is not true when the cells have unequal probabilities under $F$.) (3)
Empirical studies have shown that the $\chi^2$ distribution is a more accurate approximation
to the exact null distribution of $X^2$ when equiprobable cells are employed.

Mann and Wald then made recommendations on the number $M$ of equiprobable cells
to be used. Their work rests on large-sample approximations and on a somewhat
complex minimax criterion, so that it is at best a rough guide in practice. Mann
and Wald found that for a sample of size $n$ (large) and significance level $\alpha$, one
should use approximately

$$M = 4\left[\frac{2n^2}{c(\alpha)^2}\right]^{1/5} \qquad (3.7)$$

where $c(\alpha)$ is the upper $\alpha$-point of the standard normal distribution. The
optimum is quite broad. In particular, the $M$ of (3.7) can be halved with
little effect on power. Retracing the Mann-Wald calculations using better
approximations, as in Schorr (1974), confirms that the "optimum" $M$ is smaller
than the value given by (3.7). Since the exact optimum depends on the criterion,
a choice of error probabilities, and of course on the assumption that the
hypothesized $F$ contains no unknown parameters, the practitioner need not go
beyond the following recommendation. For $n \ge 50$, choose a number $M$ of
equiprobable cells falling between the value (3.7) for $\alpha = 0.05$ and half that
value. This recommendation is not an endorsement of the use of $\alpha = 0.05$ in tests
of fit. Because (3.7) increases slowly with $\alpha$, but overstates the number of
cells required, the value for $\alpha = 0.05$ can also be used when larger significance
levels are in mind.
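The recipe and the recommended range are easy to tabulate. The sketch below reads (3.7) as M = 4[2n²/c(α)²]^(1/5), which reproduces the value M = 24 quoted for n = 100 in Section 3.2.4. The function names are hypothetical, and the standard library's `statistics.NormalDist` supplies the normal quantile.

```python
from statistics import NormalDist


def mann_wald_cells(n, alpha=0.05):
    """Mann-Wald number of equiprobable cells, reading (3.7) as stated above."""
    c = NormalDist().inv_cdf(1.0 - alpha)  # upper alpha-point of N(0, 1)
    return 4.0 * (2.0 * n * n / (c * c)) ** 0.2


def recommended_range(n, alpha=0.05):
    """Range from half the Mann-Wald value up to the Mann-Wald value."""
    m = mann_wald_cells(n, alpha)
    return m / 2.0, m


for n in (50, 100, 200, 500):
    low, high = recommended_range(n)
    print(n, round(low, 1), round(high, 1))
```

Since M grows only like n^(2/5), doubling the sample size raises the suggested number of cells by about a third.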

For  n < 50,  computation  and  simulation  suggest  that  the  recommendations 
above  remain  reasonable,  even  though  their  theoretical  base  in  asymptotic  theory 
is  no  longer  valid.  For  small  sample  sizes,  the  question  of  the  accuracy  of  the 
approximations  to  the  null  distribution  of  the  Pearson  statistic  becomes  more 
prominent  and  has  traditionally  influenced  the  choice  of  M relative  to  n.  The 
availability  of  inexpensive  computing  power  has  led  to  extensive  study  of  this 
issue since Cochran (1954) gave the commonly accepted rule of thumb. Cochran's
rule was that all expected cell frequencies $np_i$ should be at least 1, with at
least 80 percent being at least 5. Two papers which summarize more recent work
are  Roscoe  and  Byars  (1971),  a simulation  study,  and  Good,  Cover  and  Mitchell 
(1970),  which  is  based  on  computation  of  the  exact  distribution.  It  is  notable  that 
current  recommendations  are  stated  in  terms  of  the  average  expected  cell  frequency 
rather  than  in  terms  of  the  minimum  expected  frequency. 

Here are the findings of Roscoe and Byars, which may serve as a guide for
practitioners.

(i)  With equiprobable cells, the average expected cell frequency
should be at least 1 (that is, $n \ge M$) when testing fit at the
$\alpha = 0.05$ level; for $\alpha = 0.01$, the average expected frequency
should be at least 2 (that is, $n \ge 2M$).

(ii)  When cells are not approximately equiprobable, the average
expected frequencies in (i) should be doubled.

(iii)  These recommendations apply when $M \ge 3$. For $M = 2$ (1 degree of
freedom), the chi-square test should be replaced by the test
based on the exact binomial distribution.
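Guidelines (i) to (iii) amount to a simple check on the average expected cell frequency n/M. The sketch below is one possible encoding. The function name and its argument convention are hypothetical, and it deliberately refuses significance levels other than the two for which the guidelines are stated.

```python
def roscoe_byars_ok(n, M, alpha, equiprobable=True):
    """True if n/M meets the Roscoe-Byars minimum average expected frequency."""
    if M < 3:
        raise ValueError("guidelines apply for M >= 3; for M = 2 use the exact binomial test")
    minimum = {0.05: 1.0, 0.01: 2.0}      # guideline (i)
    if alpha not in minimum:
        raise ValueError("guidelines are stated only for alpha = 0.05 and 0.01")
    required = minimum[alpha] * (1.0 if equiprobable else 2.0)  # guideline (ii)
    return n / M >= required


print(roscoe_byars_ok(100, 25, 0.05))                      # average frequency 4
print(roscoe_byars_ok(30, 20, 0.01))                       # average frequency 1.5
print(roscoe_byars_ok(30, 20, 0.05, equiprobable=False))   # needs at least 2
```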


Even guideline (ii) is satisfied whenever Cochran's rule is satisfied,
and so is strictly less restrictive. The chi-square test with $\chi^2$ critical
points has a true $\alpha$ higher than the nominal $\alpha$ when the guidelines (i) and (ii)
are not met. Roscoe and Byars considered only $\alpha = 0.05$ and $\alpha = 0.01$,
whereas tests of fit preliminary to other statistical procedures often use
larger significance levels. Since the $\chi^2$ approximation seems least accurate
in the tails, it appears that rule (i) is adequate for such larger values of
$\alpha$. It should be noted, however, that simulation and analytic approximations both
suggest that using the maximum number of equiprobable cells ($M = n$) allowed
by guideline (i) results in a test with less power than tests having fewer
cells, against all but very short-tailed alternatives. Since the
Mann-Wald suggestion (3.7) falls within the Roscoe-Byars guidelines, we can
reaffirm the recommendation given above.

Recommendations in the composite case are less easily made. Both
theoretical and empirical results suggest that the choice of $M$ depends on
the particular hypothesized family, on the method by which the unknown
parameters are estimated, and on the alternatives we wish to detect. The
degree of arbitrariness is greatly reduced by using data-dependent cells
which are equiprobable under the estimated parameter values. This is
possible when the hypothesized family of distributions has only location
and scale parameters, as is illustrated by Example 1 in Section 3.2.4 and
discussed in Section 3.3.1. The most thorough study to date of the effect
of $M$ on the power of a chi-square test in the composite case is Dahiya and
Gurland (1973). They investigated the test of univariate normality using
$X^2(\hat\theta_n)$, the Pearson statistic with parameters estimated by the raw data MLE,
and data-dependent cells equiprobable under the estimated parameter values
for sample sizes 50 and 100. Against some alternatives (double exponential,
logistic), power decreases as $M$ increases, so that $M = 3$ is optimal. This
surprising result does not hold for the Pearson-Fisher statistic $X^2(\bar\theta_n)$, or
for the Rao-Robson statistic which we will recommend in Section 3.3.2. For
alternatives less close to the normal family, a number of cells roughly half
that specified by (3.7) gave the highest power.

The examples in this chapter will use (3.7) for $\alpha = 0.05$ as a guide in
choosing $M$. This avoids subjectivity, and in the author's experience results
in greater sensitivity than the calculations of Dahiya and Gurland suggest.
There is some evidence that an $M$ half this size may give slightly better power.

3.2.4  Examples Of The Pearson Test

Because of its relative lack of power, $X^2$ cannot be recommended for testing fit
to standard distributions for which special-purpose tests are available, or for which
the special tables of critical points needed to apply tests based on the empirical
distribution function (EDF) when parameters are estimated have been computed.
Testing fit to the family (3.5) is, on the other hand, a realistic application of
the Pearson-Fisher statistic $X^2(\bar\theta_n)$. Indeed, only chi-square tests allow solution
of this problem using tabled critical points. The examples below of $X^2$ applied to
the NOR data set are intended only as illustrations of the mechanics of applying
the test.

EXAMPLE 1. Since NOR purports to be data simulating a normal sample
with $\mu = 100$ and $\sigma = 10$, let us first assess the simulation by testing fit
to this specific distribution. The Mann-Wald recipe (3.7) with $\alpha = 0.05$
and $n = 100$ gives $M = 24$. For computational convenience, we use $M = 25$ cells
chosen to be equiprobable under N(100,100). The cell boundaries are
$100 + 10z_i$, where $z_i$ is the $0.04i$ point from the standard normal table,
$i = 1, 2, \ldots, 24$. For example, the 0.04 point is -1.75, so the upper
boundary of the leftmost cell is 100 + (10)(-1.75) = 82.5. Table 3.1
shows the cells and their observed frequencies. The expected frequencies are
all (100)(0.04) = 4. When $p_i = 1/M$ for all $i$, we have

$$X^2 = \frac{M}{n} \sum_{i=1}^{M} \left(N_i - \frac{n}{M}\right)^2.$$

So in this example,

$$X^2 = \frac{1}{4} \sum_{i=1}^{25} (N_i - 4)^2.$$

The appropriate distribution is $\chi^2(24)$, and the P-value (attained significance
level) of $X^2 = 28$ is 0.260.
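The cell construction in Example 1 can be reproduced without a normal table. The sketch below is an illustration only: the function names are hypothetical, and because the NOR observations are not reproduced in this section, the frequencies used to exercise the formula are invented rather than taken from Table 3.1.

```python
from statistics import NormalDist


def equiprobable_boundaries(mu, sigma, M):
    """Interior boundaries mu + sigma * z_i, with z_i the i/M normal point."""
    std = NormalDist()
    return [mu + sigma * std.inv_cdf(i / M) for i in range(1, M)]


def x2_equiprobable(counts):
    """Pearson statistic when every cell has probability 1/M."""
    n, M = sum(counts), len(counts)
    return (M / n) * sum((N - n / M) ** 2 for N in counts)


bounds = equiprobable_boundaries(100.0, 10.0, 25)
print(round(bounds[0], 2))  # upper boundary of the leftmost cell, about 82.5

# Hypothetical frequencies for 25 cells and n = 100 (NOT the NOR data).
demo_counts = [4] * 23 + [6, 2]
demo_x2 = x2_equiprobable(demo_counts)
print(demo_x2)
```

With equiprobable cells only the observed frequencies enter the statistic, which is why the computation reduces to a sum of squared deviations from n/M.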

To test the NOR data for fit to the family of univariate normal distributions,
an intuitively reasonable procedure is to estimate $(\mu, \sigma)$ by $(\bar X, \hat\sigma)$ and use cells with
boundaries $\bar X + z_i\hat\sigma$, where the $z_i$ are as before. These cells are equiprobable under
the normal distribution with $\mu = \bar X$ and $\sigma = \hat\sigma$. It will be remarked in Section 3.3
that the Pearson statistic with these data-dependent cells has the same large
sample distribution as if the fixed cell boundaries $100 + 10z_i$ to which the random
boundaries converge were used. This distribution is not $\chi^2(24)$, since $\mu$ and $\sigma$
were estimated by their raw data MLE's $\bar X$ and $\hat\sigma$ in computing the cell probabilities
$p_i(\bar X, \hat\sigma) = 0.04$. The appropriate distribution has the form (3.6), so that its
critical points fall between those of $\chi^2(24)$ and $\chi^2(22)$. Calculation shows that
$\bar X = 99.54$ and $\hat\sigma = 10.46$. The cell boundaries $\bar X + \hat\sigma z_i$ and the observed cell
frequencies are given at the right of Table 3.1. The observed chi-square value
is $X^2 = 22$, reflecting the somewhat better fit when parameters are estimated from
the data. The P-value falls between 0.460 (from $\chi^2(22)$) and 0.579 (from $\chi^2(24)$).
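The data-dependent version of the procedure differs from the fixed-cell version only in where the boundaries come from. A minimal sketch follows, assuming the raw data MLEs are the sample mean and the standard deviation with divisor n. The function name is hypothetical and the sample is an artificial, evenly spread set of normal quantiles, not the NOR data.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev


def pearson_random_cells(data, M):
    """Pearson X^2 for normality with M cells equiprobable under the estimate."""
    n = len(data)
    mean, sigma = fmean(data), pstdev(data)  # pstdev divides by n, like the MLE
    std = NormalDist()
    boundaries = [mean + sigma * std.inv_cdf(i / M) for i in range(1, M)]
    counts = [0] * M
    for x in data:
        # Index i such that boundaries[i-1] < x <= boundaries[i].
        counts[bisect_left(boundaries, x)] += 1
    x2 = (M / n) * sum((N - n / M) ** 2 for N in counts)
    return boundaries, counts, x2


# Hypothetical sample of 40 values spread like a N(50, 25) distribution.
sample = [NormalDist(50.0, 5.0).inv_cdf((j + 0.5) / 40) for j in range(40)]
boundaries, counts, x2 = pearson_random_cells(sample, 5)
print(counts, x2)  # P-value lies between chi-square(M-3) and chi-square(M-1) tails
```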

For comparison, the same procedure was applied to test the LOG data
set for normality. In this case, $\bar X = 99.84$ and $\hat\sigma = 16.51$, and the observed
chi-square value using cell boundaries $\bar X + \hat\sigma z_i$ is $X^2 = 31.5$. The corresponding
P-value lies between 0.086 (from $\chi^2(22)$) and 0.140 (from $\chi^2(24)$). Thus this
test has correctly concluded that NOR fits the normal family well, while the
fit of LOG is marginal. Since the logistic distributions are difficult to
distinguish from the normal family, this is a pleasing performance. In
contrast, the same procedure with $M = 10$ has $X^2 = 9.4$ for the LOG data, so
that the P-value lies between 0.225 (from $\chi^2(7)$) and 0.402 (from $\chi^2(9)$).
Using 3 cells gives $X^2 = 0.98$ and again fails to suggest that the LOG
data set is not normally distributed. Thus for these particular data, the
larger $M$ suggested by (3.7) produces a more sensitive test.

EXAMPLE 2. The same procedure can be applied to the EMEA data, but
a glance shows that these data as given are discrete and therefore not normal.
Indeed, with 15 cells equiprobable under the $N(\bar X, \hat\sigma)$ distribution for these
data, $X^2 = 554$. Since the data are grouped in classes centered at integers,
a more intelligent procedure is to use fixed cells of unit width centered
at the integers, with cell probabilities computed from $N(\bar X, \hat\sigma)$. Of course,
$\bar X$ and $\hat\sigma$ from the grouped data are only approximate. Sheppard's correction
for $\hat\sigma$ improves the approximation, and gives $\bar X = 14.540$ and $\hat\sigma = 2.216$.
Calculating the cell probabilities and computing the Pearson statistic, we
obtain $X^2 = 7.56$. The P-value lies between 0.819 (from $\chi^2(12)$) and 0.911
(from $\chi^2(14)$), so that the EMEA data fit the normal family very well indeed.
The applicability of $X^2$ to grouped data such as these is an advantage of
chi-square methods.

3.2.5  Recommendations  For  Use  Of  The  Pearson  Statistic 

(1)  When the raw data MLE $\hat\theta_n$ is computationally simpler than the
grouped data estimator $\bar\theta_n$, do not hesitate to use $\hat\theta_n$. The critical
points of $X^2(\hat\theta_n)$ fall between those of $\chi^2(M-1)$ and $\chi^2(M-p-1)$, and
simulation suggests that $X^2(\hat\theta_n)$ is usually more powerful than the
Pearson-Fisher test based on $X^2(\bar\theta_n)$ and $\chi^2(M-p-1)$ critical points.

(2)  When testing fit to a location-scale family $\{F(\cdot|\theta)\}$, use cells which
are equiprobable under the estimated value of $\theta$. The fact that these
cells are data-dependent does not affect the distribution theory, as
Section 3.3.1 discusses more fully.

(3)  Choose the number $M$ of equiprobable cells to be approximately $2n^{2/5}$.
(This is based on the discussion in Section 3.2.3. Half the Mann-Wald
recipe (3.7) for $\alpha = 0.05$ is $1.9n^{2/5}$.)

3.3  GENERAL CHI-SQUARE STATISTICS
3.3.1  Data-dependent Cells

As already noted in Section 3.2.4, the use of data-dependent cells increases
the flexibility of chi-square tests, fortunately without increasing their complexity
in practice. The essential requirement is that as the sample size increases, the
random cell boundaries must converge in probability to a set of fixed boundaries.
The limiting cells will usually be unknown, since they depend on the true parameter
value $\theta_0$. Random cells are used in chi-square tests by "forgetting" that
the cells are data-dependent and proceeding as if fixed cells had been
chosen. Since the cell frequencies are no longer multinomial, the theory
of such tests is mathematically difficult. But in practice, the limiting
distribution of $X^2$ with random cells is exactly the same as if the limiting
fixed cells had been used. This is true even when parameters are estimated.
Details and regularity conditions appear in Section 4 of Moore and Spruill
(1975). Therefore, any statistic, such as the Pearson-Fisher $X^2(\bar\theta_n)$,
which has a $\theta_0$-free limiting null distribution using fixed cells, has that
same limiting null distribution for any choice of converging random cells.

A statistic such as the Chernoff-Lehmann $X^2(\hat\theta_n)$ which has a $\theta_0$-dependent
limiting null distribution for fixed cells, has in general this same
deficiency with random cells. But if the hypothesized family $\{F(\cdot|\theta)\}$
is a location-scale family, a proper choice of random cells eliminates this
$\theta_0$-dependency and also allows cells to be chosen equiprobable under the
estimated $\theta$, thus matching the recommended practice in the simple hypothesis
case. Such cell choices should be made whenever possible. Theorem 4.3 of
Moore and Spruill (1975) is a general account of this. Let us here illustrate
it by returning to the $X^2$ statistic for testing univariate normality.

When the parameter $\theta = (\mu, \sigma)$ is estimated by $\hat\theta_n = (\bar X, \hat\sigma)$ and cell
boundaries $\bar X + z_i\hat\sigma$ are used, the estimated cell probabilities are

$$p_i(\bar X, \hat\sigma) = \int_{\bar X + z_{i-1}\hat\sigma}^{\bar X + z_i\hat\sigma} (2\pi\hat\sigma^2)^{-1/2} e^{-(t-\bar X)^2/2\hat\sigma^2}\,dt = \int_{z_{i-1}}^{z_i} (2\pi)^{-1/2} e^{-u^2/2}\,du.$$

These are not dependent on $(\bar X, \hat\sigma)$, and are equiprobable if the $z_i$ are the
successive $i/M$ points of the standard normal distribution. Since this choice
of cells leaves both $N_i$ and $p_i$ unchanged when any location-scale transformation
is applied to all observations $X_j$, the Pearson statistic has the same
distribution for all $(\mu, \sigma)$. The limiting null distribution has the form
(3.6) but the $\lambda_k$ are now free of any unknown parameter. Critical points may
therefore be computed. Two methods for doing so, and tables for testing
normality, appear in Dahiya and Gurland (1972) and Moore (1971). Dahiya and
Gurland (1973) study the power of this test. The idea of using random
cells in this fashion is due to A. R. Roy (1956) and G. S. Watson (1957, 1958,
1959). We will refer to the Pearson statistic using the raw data MLE and
random cells as the Watson-Roy statistic. Example 1 in Section 3.2.4
illustrated its use.



Note that the Watson-Roy statistic has $\theta$-free limiting null distribution
only for location-scale families, that this distribution is not a standard
tabled distribution, and that a separate calculation of critical points is
required for testing fit to each location-scale family. These statements
are also true for EDF tests of fit. Since the latter are more powerful, the
Watson-Roy statistic has few advantages when $F(\cdot|\theta)$ is univariate and
continuous. Nonetheless, data-dependent cells move the cells to the data without
essentially changing the asymptotic distribution theory of the chi-square
statistic. They should be routinely employed in practice, and this is done in
most of the examples in this chapter.

3.3.2  General Quadratic Forms

Some of the most useful recent work on chi-square tests involves the study of
quadratic forms in the standardized cell frequencies other than the sum of squares
used by Pearson. Random cells are commonly recommended in these statistics, for the
reasons outlined in Section 3.3.1, and do not affect the theory. A statement of
the nature and behavior of these general statistics of chi-square type is necessarily
somewhat complex. Practitioners may find it helpful to study the examples computed
in Section 3.3.3 and in Rao and Robson (1974) before approaching the summary
treatment below.

Random cells should be denoted by $E_i(X_1, \ldots, X_n)$ in a precise notation, but here
the notation $E_i$ for cells and $N_i$ for cell frequencies will be continued. The "cell
probabilities" under $F(\cdot|\theta)$ are

$$p_i(\theta) = \int_{E_i} dF(x|\theta), \qquad i = 1, \ldots, M.$$

Denote by $V_n(\theta)$ the M-vector of standardized cell frequencies having ith component

$$[N_i - np_i(\theta)]/(np_i(\theta))^{1/2}.$$

If $Q_n = Q_n(X_1, \ldots, X_n)$ is a possibly data-dependent $M \times M$ symmetric nonnegative
definite matrix, the general form of statistic to be considered is

$$V_n(\tilde\theta_n)'\,Q_n\,V_n(\tilde\theta_n) \qquad (3.8)$$

when $\theta$ is estimated by $\tilde\theta_n$. The Pearson statistic is the special case for
which $Q_n = I_M$, the $M \times M$ identity matrix. The large-sample theory of these
statistics is given in Moore and Spruill (1975). The basic idea is that of
Pearson's proof: To show that $V_n(\tilde\theta_n)$ is asymptotically multivariate normal
(even with random cells) and then apply the distribution theory of quadratic
forms in multivariate normal random variables. All statistics of form (3.8) have
as their limiting null distribution that of a linear combination of independent
chi-square random variables. References on the calculation of such
distributions may be found in Davis (1977).

To  avoid  the  necessity  to  compute  special  critical  points,  it  is 

advantageous  to  seek  statistics  (3.8)  which  have  a chi-square  limiting  null 
distribution.  This  idea  is  due  to  D.  S.  Robson.  Rao  and  Robson  (1974) 


treat  the  important  case  of  raw  data  MLE's.  They  give  the  quadratic  form 
in  V^(0^)  having  the  x^(M-l)  limiting  null  distribution.  The  appropriate 
matrix  is  where 

Q(0)  = + B(0)[J(e)-B(9)'B(0)]'^B(0)', 

J(0)  is  the  pxp  Fisher  information  matrix  for  F("|0),  and  B(0)  is  the  Mxp 
matrix  with  (i,j)th  entry 


Pi(e) 


.l/2  3Pi(0) 


The  Rao-Robion  Atatutic  is 


R = V (0  )'Q(0  )V  (0  ). 
n n*-  n n'  n''  n^ 


This test can be used whenever $J - B'B$ is positive definite. Since
$nJ$ is the information matrix from the raw data and $nB'B$ the information
matrix from the cell frequencies, $J - B'B$ is always nonnegative definite.

Notice that $R_n$ is just the Pearson statistic $X^2(\hat\theta_n)$ plus a term which
conceptually builds up the distribution (3.6) to $\chi^2(M-1)$. This term
simplifies considerably, since $\sum_i \partial p_i/\partial\theta_j = 0$ implies that

$$V_n'B = n^{-1/2}\Bigl(\sum_{i=1}^M \frac{N_i}{p_i}\frac{\partial p_i}{\partial\theta_1},\ \ldots,\ \sum_{i=1}^M \frac{N_i}{p_i}\frac{\partial p_i}{\partial\theta_p}\Bigr) \qquad (3.9)$$

and

$$R_n = X^2(\hat\theta_n) + (V_n'B)(J - B'B)^{-1}(V_n'B)', \qquad (3.10)$$

all terms being evaluated at $\theta = \hat\theta_n$. Further simplification can be
achieved in location-scale cases by the use of random cells for which
$p_i(\hat\theta_n) = 1/M$. Rao and Robson (1974) give several examples of the use of this
statistic, using random cells in some cases.
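
The reduction of the correction term can be spelled out in two lines. The sketch below
assumes, as an interpretation of the notation used in this chapter, that the standardized cell
frequencies $V_n$ have components $[N_i - np_i]/(np_i)^{1/2}$ and that $B$ has entries
$p_i^{-1/2}\,\partial p_i/\partial\theta_j$; all quantities are evaluated at $\theta = \hat\theta_n$.

```latex
\begin{align*}
(V_n'B)_j
  &= \sum_{i=1}^{M} \frac{N_i - np_i}{(np_i)^{1/2}}\; p_i^{-1/2}\,\frac{\partial p_i}{\partial\theta_j}
   = n^{-1/2}\sum_{i=1}^{M} \frac{N_i}{p_i}\frac{\partial p_i}{\partial\theta_j}
     \;-\; n^{1/2}\sum_{i=1}^{M} \frac{\partial p_i}{\partial\theta_j} \\
  &= n^{-1/2}\sum_{i=1}^{M} \frac{N_i}{p_i}\frac{\partial p_i}{\partial\theta_j}
     \qquad\text{since } \sum_{i=1}^{M} p_i(\theta) = 1 \text{ for every } \theta, \\[4pt]
V_n'\,Q\,V_n
  &= V_n'V_n + (V_n'B)\,(J - B'B)^{-1}\,(B'V_n)
   = X^2 + (V_n'B)\,(J - B'B)^{-1}\,(V_n'B)'.
\end{align*}
```

The first display is the $j$th component of the vector in (3.9); the second shows why the
quadratic form with matrix $Q$ separates into the Pearson statistic plus a correction term.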

Simulations by Rao and Robson show that $R_n$ has generally greater power
than either the Pearson-Fisher or Watson-Roy statistics. Spruill (1975) gives
a theoretical treatment showing that $R_n$ dominates the Watson-Roy statistic
for any location-scale family $\{F(\cdot|\theta)\}$. Since $R_n$ is powerful, has tabled
critical points, and is easy to compute whenever the MLE $\hat\theta_n$ can be obtained,
it is recommended as a standard chi-square test of fit. Moore (1977)
gives a general recipe for the quadratic form having the $\chi^2(M-1)$ distribution
when nearly arbitrary estimators $\theta_n$ are used. The idea parallels Pearson's
proof, using a generalized inverse of the covariance matrix. The Pearson-Fisher
and Rao-Robson statistics are the $\bar\theta_n$ and $\hat\theta_n$ special cases of this
recipe, which is the Wald's method statistic.

If (3.6) can be built up to $\chi^2(M-1)$, it can also be chopped down to
$\chi^2(M-p-1)$. Dzhaparidze and Nikulin (1974) point out that the appropriate
statistic is

$$Z_n(\theta_n^*) = V_n'\,\bigl(I_M - B(B'B)^{-1}B'\bigr)\,V_n$$

where $V_n$ and $B$ are evaluated at $\theta = \theta_n^*$. $Z_n$ has the $\chi^2(M-p-1)$ limiting
distribution whenever $\theta_n^*$ approaches $\theta_0$ at the usual $n^{1/2}$ rate, and can
therefore be used with any reasonable estimator of $\theta$. Computation of $Z_n$
is again simplified by (3.9). As might be expected, simulations suggest
that $Z_n(\hat\theta_n)$ is inferior in power to both the Watson-Roy and Rao-Robson
statistics.

3.3.3  Examples of General Chi-Square Tests

EXAMPLE 1.  It is desired to test fit to the negative exponential family

$$f(x|\theta) = \theta^{-1}e^{-x/\theta}, \qquad 0 < x < \infty$$

where $\Omega = \{\theta: 0 < \theta < \infty\}$. Since the MLE of $\theta$, $\hat\theta_n = \bar X$, is available, the
Rao-Robson statistic is the recommended chi-square test. When $p = 1$, (3.9)
and (3.10) reduce to

$$R_n = \sum_{i=1}^M \frac{(N_i - np_i)^2}{np_i} + \frac{1}{nD}\Bigl(\sum_{i=1}^M \frac{N_i}{p_i}\frac{dp_i}{d\theta}\Bigr)^2$$

where

$$D = J - \sum_{i=1}^M \frac{1}{p_i}\Bigl(\frac{dp_i}{d\theta}\Bigr)^2$$

and $J$, $p_i$, $dp_i/d\theta$ are all evaluated at $\theta = \hat\theta_n$. For a sample of size $n = 100$,
we will once more use $M = 25$ equiprobable cells. In this scale-parameter family,
equiprobable cells are achieved by the use of random cell boundaries of the form $z_i\bar X$.
From

$$p_i(\theta) = \int_{z_{i-1}\bar X}^{z_i\bar X} \theta^{-1}e^{-x/\theta}\,dx \qquad (3.11)$$

the condition $p_i(\bar X) = 1/25$ gives $z_0 = 0$, $z_{25} = \infty$ and

$$z_i = -\log\Bigl(1 - \frac{i}{25}\Bigr), \qquad i = 1,\ldots,24.$$

Differentiating (3.11) under the integral sign, then substituting $\theta = \bar X$, gives

$$\frac{dp_i}{d\theta} = v_i/\bar X, \qquad v_i = z_{i-1}e^{-z_{i-1}} - z_ie^{-z_i}.$$

Because of their iterative nature, the quantities $v_i$ are easily computed on a
programmable calculator. The Fisher information is $J(\theta) = \theta^{-2}$.
TABLE 3.2

The Rao-Robson test for the negative exponential family,
with 25 equiprobable cells

                                    WE2                    EXP
   i      $z_i$     $v_i$      $z_i\bar X$   $N_i$    $z_i\bar X$   $N_i$

   1      .0408    -.0392        0.036         1         0.221        6
   2      .0834    -.0375        0.073         0         0.451        5
   3      .1278    -.0358        0.112         1         0.692        3
   4      .1743    -.0340        0.153         1         0.944        2
   5      .2231    -.0321        0.196         3         1.208        5
   6      .2744    -.0301        0.241         1         1.486        5
   7      .3285    -.0279        0.288         2         1.779        7
   8      .3857    -.0257        0.338         3         2.088        2
   9      .4463    -.0234        0.392         5         2.416        4
  10      .5108    -.0209        0.448         5         2.766        3
  11      .5798    -.0182        0.509         1         3.140        3
  12      .6539    -.0153        0.574         5         3.541        4
  13      .7340    -.0123        0.644         3         3.974        6
  14      .8210    -.0089        0.721         5         4.445        3
  15      .9163    -.0053        0.804         8         4.962        4
  16     1.0216    -.0013        0.897         4         5.532        4
  17     1.1394     .0032        1.000        16         6.170        3
  18     1.2730     .0082        1.118         9         6.893        3
  19     1.4271     .0139        1.253        11         7.728        4
  20     1.6094     .0206        1.413         7         8.715        2
  21     1.8326     .0287        1.609         5         9.923        7
  22     2.1203     .0388        1.861         1        11.481        3
  23     2.5257     .0524        2.217         3        13.676        3
  24     3.2189     .0733        2.826         0        17.430        6
  25       ∞        .1288          ∞           0           ∞          3


With these quantities,

$$D = \bar X^{-2}\Bigl[1 - 25\sum_{i=1}^{25} v_i^2\Bigr].$$

Finally,

$$R_n = \frac{1}{4}\sum_{i=1}^{25}(N_i - 4)^2 + \frac{(25)^2}{100}\,\frac{\bigl(\sum_{i=1}^{25} N_iv_i\bigr)^2}{1 - 25\sum_{i=1}^{25} v_i^2}.$$

Table 3.2 records $z_i$ and $v_i$, from which

$$1 - 25\sum_{i=1}^{25} v_i^2 = 0.04255.$$

For the WE2 data set, $\bar X = 0.878$. The resulting cell boundaries and cell
frequencies appear in Table 3.2, and

$$R_n = \frac{1}{4}(351) + \frac{(25)^2}{100}\,\frac{(-0.0519)^2}{0.04255} = 87.75 + 0.40 = 88.15.$$

This gives a P-value of $3 \times 10^{-9}$ using the $\chi^2(24)$ distribution. In contrast,
the EXP data set has $\bar X = 5.415$, cell boundaries and frequencies given at the
right of Table 3.2, and

$$R_n = \frac{1}{4}(54) + \frac{(25)^2}{100}\,\frac{(-0.1231)^2}{0.04255} = 13.5 + 2.23 = 15.73.$$

The P-value from $\chi^2(24)$ is 0.898.

As these examples suggest, the Pearson statistic $X^2(\hat\theta_n)$, which is the
first component of $R_n$, is usually adequate for drawing conclusions when $M$
is large and $p$ is small. In this example, the critical points of $X^2(\hat\theta_n)$
fall between those of $\chi^2(23)$ and those of $\chi^2(24)$. A reasonable strategy is
to compute $X^2(\hat\theta_n)$ first, completing the computation of $R_n$ only if the
results after the first stage are ambiguous.
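
The arithmetic of Example 1 is easy to script. The following is a minimal sketch, not the
author's program: the function names are hypothetical, the $v_i$ are computed from
$v_i = z_{i-1}e^{-z_{i-1}} - z_ie^{-z_i}$ with $e^{-z_i} = 1 - i/M$, and the cell frequencies are the
EXP column of Table 3.2. Because the code carries full floating-point precision rather than
four-decimal table entries, its output can differ slightly from the rounded figures quoted in
the text.

```python
import math

def exponential_cells(M):
    """Return (z, v) for M equiprobable cells of the negative exponential family.

    z[i] = -log(1 - i/M) for i = 0..M-1 and z[M] = infinity;
    v[i-1] = z[i-1]*exp(-z[i-1]) - z[i]*exp(-z[i]) belongs to cell i = 1..M.
    """
    z = [-math.log(1.0 - i / M) for i in range(M)] + [math.inf]
    # g(z) = z * exp(-z); exp(-z_i) equals 1 - i/M, and g tends to 0 as z -> infinity.
    g = [z[i] * (1.0 - i / M) for i in range(M)] + [0.0]
    v = [g[i - 1] - g[i] for i in range(1, M + 1)]
    return z, v

def rao_robson_exponential(counts):
    """Return (pearson, correction); the Rao-Robson statistic is their sum."""
    M = len(counts)
    n = sum(counts)
    _, v = exponential_cells(M)
    expected = n / M                                   # n * p_i with p_i = 1/M
    pearson = sum((c - expected) ** 2 for c in counts) / expected
    denom = 1.0 - M * sum(x * x for x in v)            # 1 - M * sum(v_i^2)
    weighted = sum(c * x for c, x in zip(counts, v))   # sum(N_i * v_i)
    correction = (M ** 2 / n) * weighted ** 2 / denom
    return pearson, correction

# Cell frequencies of the EXP data set, copied from the right-hand column of Table 3.2.
EXP_COUNTS = [6, 5, 3, 2, 5, 5, 7, 2, 4, 3, 3, 4, 6,
              3, 4, 4, 3, 3, 4, 2, 7, 3, 3, 6, 3]

pearson, correction = rao_robson_exponential(EXP_COUNTS)
print(f"X^2 = {pearson:.2f}, correction = {correction:.2f}, "
      f"R_n = {pearson + correction:.2f}")
```

Note that the sample mean does not enter the calculation once the cell frequencies are known:
with boundaries $z_i\bar X$ the factor $\bar X^{-2}$ cancels between $D$ and the squared sum.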

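EXAMPLE 2.  The BAEN data are to be tested for fit to the double-exponential
family

$$f(x|\theta) = \frac{1}{2\theta_2}\,e^{-|x-\theta_1|/\theta_2}, \qquad -\infty < x < \infty$$

$$\Omega = \{\theta = (\theta_1,\theta_2): -\infty < \theta_1 < \infty,\ 0 < \theta_2 < \infty\}.$$

The MLE $\hat\theta_n = (\hat\theta_{1n},\hat\theta_{2n})$ from a random sample $X_1,\ldots,X_n$ is

$$\hat\theta_{1n} = \text{median}(X_1,\ldots,X_n)$$

$$\hat\theta_{2n} = \frac{1}{n}\sum_{j=1}^n |X_j - \hat\theta_{1n}|.$$

In this location-scale setting, equiprobable cells with boundaries
$\hat\theta_{1n} + a_i\hat\theta_{2n}$ will again be employed. Using an even number of cells, say
$M = 2\nu$, and choosing the $a_i$ symmetrically as $a_{\nu-i} = -c_i$, $a_{\nu+i} = c_i$, where

$$c_i = -\log\Bigl(1 - \frac{i}{\nu}\Bigr), \qquad i = 0,\ldots,\nu$$

(in particular, $a_0 = -\infty$, $a_\nu = 0$, $a_M = \infty$) gives $p_i(\hat\theta_n) = 1/M$.

Computations similar to those shown in Example 1 yield

$$\frac{\partial p_i}{\partial\theta_1}(\hat\theta_n) = \begin{cases} -1/M\hat\theta_{2n}, & i = 1,\ldots,\nu \\ \phantom{-}1/M\hat\theta_{2n}, & i = \nu+1,\ldots,M \end{cases} \qquad (3.12)$$

$$\frac{\partial p_i}{\partial\theta_2}(\hat\theta_n) = \frac{1}{2\hat\theta_{2n}}\bigl(c_{k-1}e^{-c_{k-1}} - c_ke^{-c_k}\bigr), \qquad i = \nu-k+1 \text{ and } \nu+k;\quad k = 1,\ldots,\nu.$$

If $d_k = c_{k-1}e^{-c_{k-1}} - c_ke^{-c_k}$, then

$$B(\hat\theta_n)'B(\hat\theta_n) = \hat\theta_{2n}^{-2}\begin{pmatrix} 1 & 0 \\ 0 & \nu\sum_{i=1}^{\nu} d_i^2 \end{pmatrix}.$$

Since the information matrix is $\hat\theta_{2n}^{-2}I_2$, the matrix $J(\hat\theta_n) - B(\hat\theta_n)'B(\hat\theta_n)$
has rank 1 and the Rao-Robson statistic is not defined. (The reason for this
unusual situation is that for this choice of cells, the median is both the
raw data MLE and the grouped data MLE for $\theta_1$.) The Dzhaparidze-Nikulin
statistic is

$$Z_n(\hat\theta_n) = \frac{M}{n}\sum_{i=1}^M \Bigl(N_i - \frac{n}{M}\Bigr)^2 - \frac{\nu}{n}\,\frac{\bigl[\sum_{i=1}^{\nu} d_i(N_{\nu-i+1} + N_{\nu+i})\bigr]^2}{\sum_{i=1}^{\nu} d_i^2}.$$

This computation was simplified by the fact that $B'B$ is diagonal and the
first term of (3.9) is 0 by (3.12) and the definition of the median.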

The BAEN data contain $n = 33$ observations, for which
$\hat\theta_{1n} = 10.13$ and $\hat\theta_{2n} = 3.36$. Table 3.3 contains $c_i$, upper cell boundaries
$\hat\theta_{1n} + c_i\hat\theta_{2n}$, and cell frequencies for these data. The statistic $Z_n$ is,
after some arithmetic,

$$Z_n = 7.30 - 1.59 = 5.71.$$

The upper-tail P-value from $\chi^2(7)$ is 0.574. The Pearson statistic $X^2 = 7.30$ has
critical points falling between those of $\chi^2(7)$ and $\chi^2(8)$, taking advantage of
the fact that the grouped data MLE was used to estimate one of the two unknown
parameters. The corresponding bounds on the P-value are 0.398 and 0.505. The
double exponential model clearly fits the BAEN data very well. Even though an
anomaly reduced from 2 to 1 the difference in the degrees of freedom of the $\chi^2$
distributions bounding $X^2$, there is a considerable spread in the corresponding
P-values. This is typical when $n$ (and therefore $M$) is small. In examples where
the goodness of fit is less clear than here, use of $R_n$ or $Z_n$ can be essential
to a clear conclusion.

TABLE 3.3

Testing the fit of the BAEN data
to the double exponential family

  Cell      $c_i$      $\hat\theta_{1n} + c_i\hat\theta_{2n}$     $N_i$

    1      -1.609              4.722                  4
    2      -0.916              7.051                  7
    3      -0.511              8.414                  3
    4      -0.223              9.380                  2
    5       0                 10.130                  1
    6       0.223             10.880                  3
    7       0.511             11.846                  4
    8       0.916             13.209                  3
    9       1.609             15.538                  4
   10        ∞                   ∞                    2

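
P-values such as those quoted in these examples can be checked without tables. The sketch
below is one way to do so using only the Python standard library; the function name
`chi2_upper_tail` is hypothetical, and the method is the classical finite-series form of the
chi-square upper tail for integer degrees of freedom (a Poisson-type sum for even degrees of
freedom, a normal tail plus a finite sum for odd degrees of freedom), not a routine taken from
this chapter.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= 1 exceeds x)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2.0 * r)
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: 2*Q(chi) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    # where chi = sqrt(x), Q is the normal upper tail and phi the normal density.
    chi = math.sqrt(x)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2.0 * r + 1.0)
    return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total

# Bounds on the P-value of the Pearson statistic 7.30 for the BAEN data.
for df in (7, 8):
    print(df, round(chi2_upper_tail(7.30, df), 3))
```

The same function applies to any statistic in this chapter whose limiting null distribution is
chi-square, for instance `chi2_upper_tail(15.73, 24)` for the EXP data of Example 1.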

Nonstandard  Chi-Square  Statistics 

The class of standard chi-square statistics is composed of all nonnegative
definite quadratic forms in the standardized cell frequencies, with possibly estimated
parameters and data-dependent cells. Such statistics have a unified large-
sample theory given by Moore and Spruill (1975). Other classes of statistics
are less well explored but may hold promise. A few are mentioned here.
Since none can yet compete with standard statistics in practice, this
section can be considered optional reading.

(a)  Increasing M with n.  Standard statistics assume that the number
of cells $M$ remains fixed as the sample size $n$ increases. The usual practice
is to use more cells as $n$ increases (recall the Mann-Wald suggestion (3.7)), yet this
practice is not explicitly recognized in the theory of standard chi-square statistics.
Kempthorne (1968) proposed the use of the Pearson statistic with $M = n$ equiprobable
cells. Such statistics have a large-sample theory very different from that
of standard statistics. For the case of testing fit to a completely
specified distribution, Morris (1975) shows that the Pearson statistic
has a normal limiting null distribution in some generality when $M$ increases
with $n$. The behavior of such statistics when parameters must be estimated
is largely unexplored. Simulation studies of Kempthorne's statistic suggest
that standard statistics with fewer cells have superior power except against
very short-tailed alternatives.

(b)  Sequentially adjusted cells.  By use of the conditional probability
integral transformation (see Chapter 6), O'Reilly and Quesenberry (1973)
obtain particular members of the following class of nonstandard chi-square
tests. Rather than base cell frequencies on cells $E_i$ (fixed) or
$E_i(X_1,\ldots,X_n)$ (data-dependent) into which all of $X_1,\ldots,X_n$ are classified, the
cells used to classify each successive $X_j$ are functions $E_{ij}$ of $X_1,\ldots,X_j$
only. Thus additional observations do not require reclassification of
earlier observations, as in the usual random cell case. No general theory
of chi-square statistics based on such sequentially adjusted cells is known.
O'Reilly and Quesenberry obtain by their transformation approach specific
functions $E_{ij}$ such that the cell frequencies are multinomially distributed
and the Pearson statistic has the $\chi^2(M-1)$ limiting null distribution. The
transformation approach requires the computation of the minimum variance
unbiased estimator of $F(\cdot|\theta)$. Testing fit to an uncommon family thus
requires the practitioner to do a hard calculation. Moreover, any test
using sequentially adjusted cells has the disadvantage that the value of the
statistic depends on the order in which the observations were obtained.
These are serious barriers to use.

(c)  Easterling's approach.  Easterling (1976) provides an interesting
approach to parameter estimation based on tests of fit. Roughly speaking,
he advocates replacing the usual confidence intervals for $\theta$ in $F(\cdot|\theta)$
based on the acceptance regions of a test of

$$H_0: \theta = \theta_0$$
$$H_1: \theta \ne \theta_0$$

with intervals based on the acceptance regions of tests of fit to completely
specified distributions,

$$H_0^*: G(\cdot) = F(\cdot|\theta_0)$$
$$H_1^*: G(\cdot) \ne F(\cdot|\theta_0).$$

In the course of his discussion, Easterling suggests rejecting the family
$\{F(x|\theta): \theta \text{ in } \Omega\}$ as a model for the data if the (say) 50% confidence interval
for $\theta$ based on acceptance regions for $H_0^*$ is empty. This "implicit test
of fit" deserves comment, using the chi-square case to make some observations
which apply as well when other tests of $H_0^*$ are employed.

Taking then the standard chi-square statistic for $H_0^*$,

$$X^2(\theta_0) = \sum_{i=1}^M \frac{[N_i - np_i(\theta_0)]^2}{np_i(\theta_0)}$$

and denoting by $\chi^2_\alpha(M-1)$ the upper $\alpha$-point of the $\chi^2(M-1)$ distribution, the
$(1-\alpha)$-confidence interval is empty if and only if

$$X^2(\theta) > \chi^2_\alpha(M-1) \qquad \text{for all } \theta \text{ in } \Omega. \qquad (3.13)$$

But if $\bar\theta_n$ is the minimum chi-square estimator, (3.13) holds if and only if

$$X^2(\bar\theta_n) > \chi^2_\alpha(M-1). \qquad (3.14)$$

When any $F(x|\theta)$ is true, $X^2(\bar\theta_n)$ has the $\chi^2(M-p-1)$ distribution, and the
probability of the event (3.14) can be explicitly computed. It is less
than $\alpha$, but close to $\alpha$ when $M$ is large. Thus Easterling's suggestion
essentially reduces to the use of standard tests of fit with parameters
estimated by the minimum distance method corresponding to the test statistic
employed. Moreover, his method by-passes a proper consideration of the
distributional effects of estimating unknown parameters.


3.4  RECOMMENDATIONS AND FURTHER EXAMPLES


3.4.1  Use of Chi-Square Tests


Chi-square tests are generally less powerful than EDF
tests and special-purpose tests of fit. It is difficult to assess the seriousness of
this lack of power from published sources. Comparative studies have generally used
the Pearson statistic rather than the more powerful Watson-Roy and Rao-Robson statistics.
Moreover, such studies have often dealt with problems of parameter estimation in
ways which tend to understate the power of general purpose tests such
as chi-square and Kolmogorov-Smirnov tests. This is true of the study by
Shapiro, Wilk and Chen (1968), for example. Reliable information about the
power of chi-square tests for normality can be gained from Table IV of
Rao and Robson (1974) and from Tables 1 and 2 of Dahiya and Gurland (1973).
The former demonstrates strikingly the gain in power (always at least 40%
in the cases considered, and usually much greater) obtained by abandoning
the Pearson-Fisher statistic for more modern chi-square statistics.
Nonetheless, chi-square tests cannot in general match EDF and special
purpose tests of fit in power.


This relative lack of power implies three theses on the practical use
of chi-square techniques. First, chi-square tests of fit must compete for
use primarily on the basis of flexibility and ease of use. Discrete
and/or multivariate data do not discomfit chi-square methods, and the
necessity to estimate unknown parameters is more easily dealt with by chi-
square tests than by other tests of fit.

Second, chi-square statistics actually having a (limiting) chi-square
null distribution have a much stronger claim to practical usefulness. Ease
of use requires the ability to obtain (1) the observed value of the test
statistic, and (2) critical points for the test statistic. The calculations
required for (1) in chi-square statistics are at most iterative solutions
of nonlinear equations and evaluation of quadratic forms, perhaps with
matrix expressed as the inverse of a given symmetric pd matrix. These
are not serious barriers to practical use, given the current availability
of computer library routines. Computation of critical points of an
untabled distribution is a much harder task for a user of statistical
methods. Chi-square and EDF statistics both have as their limiting null
distributions the distributions of linear combinations of central chi-square
random variables. General statistics of both classes require a separate
table of critical points for each hypothesized family. The effort
needed is justified when the hypothesized family is common, but should be
expended on a test more powerful than chi-square tests. In less common
cases, or when no more powerful test with $\theta$-free null distribution is
available, there are several chi-square tests requiring only tables of the
$\chi^2$ distribution. These include the Pearson-Fisher, Rao-Robson, and
Dzhaparidze-Nikulin tests, and others which can be constructed by the method of
Moore (1977). Among the chi-square statistics proposed and studied to date,
the Rao-Robson statistic $R_n$ of (3.10) appears to have generally superior power and
is therefore the statistic of choice. Computation of $R_n$ in the nonstandard cases
most appropriate for chi-square tests of fit does require some mathematical work.
However, the Pearson statistic with raw-data MLE's is the first and usually
dominant component of $R_n$. If $X^2(\hat\theta_n)$ itself lies in the upper tail of the $\chi^2(M-1)$
distribution, the fit can be rejected without computing $R_n$.

The  third  thesis  rests  on  the  exposition  and  examples  in  this  chapter. 

Chi-square  tests  are  the  most  practical  tests  of  fit  in  many  situations. 

When parameters must be estimated in non-location-scale families or in
uncommon distributions, when the data are discrete, multivariate, or even
(see the next section) censored, chi-square tests remain easily applicable.

3.4.2  Further Examples

Chi-square tests should not be used for testing the fit of full ungrouped samples
to common univariate distributions. There are more powerful tests available in such
situations. Yet many of the examples given have concerned such situations. This
section illustrates the flexibility of chi-square methods in two more appealing settings,
one multivariate and one with censored data. As in the earlier examples of this
chapter, the required numerical calculations are easily done on a programmable
calculator.

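EXAMPLE 1.  The circular bivariate normal family is a common model for
errors in "bombing" a target. It represents the effect of independent
normal horizontal and vertical components with equal variances. The
density function is

$$f(x,y|\theta) = \frac{1}{2\pi\sigma^2}\exp\Bigl\{-\frac{1}{2\sigma^2}\bigl[(x-\mu_1)^2 + (y-\mu_2)^2\bigr]\Bigr\}, \qquad -\infty < x,y < \infty$$

$$\Omega = \{\theta = (\mu_1,\mu_2,\sigma): -\infty < \mu_1,\mu_2 < \infty,\ 0 < \sigma < \infty\}.$$

The MLE of $\theta$ from a random sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ is
$\hat\theta_n = (\hat\mu_1,\hat\mu_2,\hat\sigma)$, where

$$\hat\mu_1 = \bar X, \qquad \hat\mu_2 = \bar Y$$

$$\hat\sigma^2 = \frac{1}{2n}\Bigl\{\sum_{j=1}^n (X_j-\bar X)^2 + \sum_{j=1}^n (Y_j-\bar Y)^2\Bigr\}.$$

In constructing a test of fit to this family, it is natural to use as cells
annuli centered at $(\bar X,\bar Y)$ with successive radii $c_i\hat\sigma$ for

$$0 = c_0 < c_1 < \cdots < c_{M-1} < c_M = \infty.$$

Thus

$$E_i = \{(x,y):\ c_{i-1}^2\hat\sigma^2 \le (x-\bar X)^2 + (y-\bar Y)^2 < c_i^2\hat\sigma^2\}.$$

The cell probabilities are

$$p_i(\theta) = \iint_{E_i} f(x,y|\theta)\,dx\,dy$$

and calculation shows that $p_i(\hat\theta_n) = 1/M$ when

$$c_i = \{-2\log(1 - i/M)\}^{1/2}, \qquad i = 1,\ldots,M-1.$$

The recommended test is based on the Rao-Robson statistic. Differentiating
$p_i(\theta)$ under the integral sign, then substituting $\theta = \hat\theta_n$ gives

$$\frac{\partial p_i}{\partial\mu_1} = \frac{\partial p_i}{\partial\mu_2} = 0, \qquad \frac{\partial p_i}{\partial\sigma} = v_i/\hat\sigma.$$

Hence

$$B'B = \hat\sigma^{-2}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & M\sum_{i=1}^M v_i^2 \end{pmatrix}.$$

The Fisher information matrix for the circular bivariate normal family is
also diagonal,

$$J(\theta) = \sigma^{-2}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix},$$

so that $(J - B'B)^{-1}$ is trivially obtained. Moreover, from (3.9) it follows
that

$$V_n'B = n^{-1/2}\Bigl(0,\ 0,\ M\sum_{i=1}^M N_iv_i/\hat\sigma\Bigr).$$

The Rao-Robson statistic is therefore

$$R_n = X^2(\hat\theta_n) + (V_n'B)(J-B'B)^{-1}(V_n'B)' = \frac{M}{n}\sum_{i=1}^M \Bigl(N_i - \frac{n}{M}\Bigr)^2 + \frac{M^2}{n}\,\frac{\bigl(\sum_{i=1}^M N_iv_i\bigr)^2}{4 - M\sum_{i=1}^M v_i^2}$$

where

$$v_i = c_{i-1}^2e^{-c_{i-1}^2/2} - c_i^2e^{-c_i^2/2}.$$

The limiting null distribution is $\chi^2(M-1)$, while that of the Pearson
statistic $X^2(\hat\theta_n)$ has critical points falling between those of $\chi^2(M-1)$
and $\chi^2(M-4)$. The Rao-Robson correction term will often be necessary for a
clear picture of the fit of this three-parameter family.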
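
A short script can carry out the whole test for a set of impact points. The sketch below is
illustrative only: the function names are hypothetical, and it assumes that
$\partial p_i/\partial\sigma = v_i/\hat\sigma$ with $v_i = c_{i-1}^2e^{-c_{i-1}^2/2} - c_i^2e^{-c_i^2/2}$, which follows
from the radial distribution $P(\text{radius} \le r) = 1 - e^{-r^2/2\sigma^2}$ of the circular normal law.
Cells are assigned through that radial distribution, which is equivalent to comparing each
radius with the boundaries $c_i\hat\sigma$.

```python
import math

def circular_normal_v(M):
    """v_i for M equiprobable annuli, using exp(-c_i^2/2) = 1 - i/M."""
    g = [0.0]                                   # c_0 = 0
    for i in range(1, M):
        u = 1.0 - i / M
        g.append(-2.0 * math.log(u) * u)        # c_i^2 * exp(-c_i^2 / 2)
    g.append(0.0)                               # limit as c_M -> infinity
    return [g[i - 1] - g[i] for i in range(1, M + 1)]

def annulus_counts(points, M):
    """Cell frequencies N_1..N_M for (x, y) pairs, annuli centred at the sample mean."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    r2 = [(x - xbar) ** 2 + (y - ybar) ** 2 for x, y in points]
    sigma2 = sum(r2) / (2.0 * n)                # MLE of sigma^2
    counts = [0] * M
    for value in r2:
        prob = 1.0 - math.exp(-value / (2.0 * sigma2))
        counts[min(int(prob * M), M - 1)] += 1
    return counts

def rao_robson_circular_normal(counts):
    """Pearson statistic plus the Rao-Robson correction for equiprobable annuli."""
    M = len(counts)
    n = sum(counts)
    v = circular_normal_v(M)
    pearson = (M / n) * sum((c - n / M) ** 2 for c in counts)
    weighted = sum(c * x for c, x in zip(counts, v))
    correction = (M ** 2 / n) * weighted ** 2 / (4.0 - M * sum(x * x for x in v))
    return pearson + correction
```

A typical call is `rao_robson_circular_normal(annulus_counts(points, M))`, with the result
referred to the $\chi^2(M-1)$ distribution.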

EXAMPLE 2.  The negative exponential distribution with density function

$$f(x|\theta) = \theta^{-1}e^{-x/\theta}, \qquad 0 < x < \infty$$

$$\Omega = \{\theta: 0 < \theta < \infty\}$$

is often assumed in life testing situations. Such studies often involve
not a full sample, but rather Type II censored data. That is, order
statistics are observed up to the sample $\alpha$-quantile,

$$X_{(1)} < X_{(2)} < \cdots < X_{([n\alpha])},$$

where $[n\alpha]$ is the greatest integer in $n\alpha$ and $0 < \alpha < 1$. It is natural to
make use of random cells with sample quantiles $\xi_i = X_{([n\delta_i])}$ as cell
boundaries. Here $\xi_0 = 0$, $\xi_M = \infty$ and

$$0 = \delta_0 < \delta_1 < \cdots < \delta_{M-1} = \alpha < \delta_M = 1,$$

so that the $n - [n\alpha]$ unobserved $X_i$ fall in the rightmost cell. Although the
cell frequencies are now fixed, the general theory of Moore and Spruill
(1975) applies to this choice of cells. The use of order statistics as cell
boundaries was considered by Witting (1959) and Bofinger (1973), but this
application to censored data seems new. For references to previous
literature on tests of fit for censored data, see Lurie, Hartley, and Stroud
(1974). This example can be taken as a response to their claim that "the
chi-square criterion is not generally applicable to testing the fit of Type
II censored samples."

The Pearson-Fisher Statistic.  Estimate $\theta$ by the grouped data MLE found
as the solution of (3.2). That equation becomes in this case

$$\sum_{i=1}^M N_i\,\frac{\xi_{i-1}e^{-\xi_{i-1}/\theta} - \xi_ie^{-\xi_i/\theta}}{e^{-\xi_{i-1}/\theta} - e^{-\xi_i/\theta}} = 0,$$

which is easily solved iteratively to obtain $\bar\theta_n = \bar\theta_n(\xi_1,\ldots,\xi_{M-1})$. The test
statistic is

$$X^2(\bar\theta_n) = \sum_{i=1}^M \frac{[N_i - np_i(\bar\theta_n)]^2}{np_i(\bar\theta_n)}$$

where

$$N_i = [n\delta_i] - [n\delta_{i-1}] \qquad \text{(nonrandom)}$$

$$p_i(\theta) = e^{-\xi_{i-1}/\theta} - e^{-\xi_i/\theta} \qquad \text{(random)}.$$

The limiting null distribution is $\chi^2(M-2)$.

The Wald's Method Statistic.  A more powerful chi-square test can be
obtained by use of the raw data MLE of $\theta$ from the censored sample, namely
(Epstein and Sobel, 1953),

$$\hat\theta_n = \frac{1}{[n\alpha]}\Bigl(\sum_{i=1}^{[n\alpha]} X_{(i)} + (n - [n\alpha])X_{([n\alpha])}\Bigr).$$

By obtaining the limiting distribution of $V_n(\hat\theta_n)$ and then finding the
appropriate quadratic form, a generalization of the Rao-Robson statistic
to censored samples can be obtained. This is done in Mihalko and Moore
(1977). The resulting statistic for the present example is

$$R_n = X^2(\hat\theta_n) + (nD)^{-1}\Bigl(\sum_{i=1}^M N_iv_i/p_i(\hat\theta_n)\Bigr)^2$$

where $N_i$ and $p_i(\theta)$ are as above, and

$$v_i = \frac{\xi_{i-1}}{\hat\theta_n}e^{-\xi_{i-1}/\hat\theta_n} - \frac{\xi_i}{\hat\theta_n}e^{-\xi_i/\hat\theta_n}$$

$$D = 1 - e^{-\xi_{M-1}/\hat\theta_n} - \sum_{i=1}^M v_i^2/p_i(\hat\theta_n).$$

In the full sample case, $\alpha = 1$, $\xi_{M-1} = \infty$, $N_M = 0$, $\hat\theta_n = \bar X$ and the statistic
reduces to the Rao-Robson statistic of Example 1, Section 3.3.3 (with
$M-1$ cells bounded by the $\xi_i$).
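
The two computational steps of the censored-sample test, the Epstein-Sobel estimate and the
statistic $R_n$, can be sketched as follows. This is an illustration under stated assumptions
rather than the procedure of Mihalko and Moore: the function names are hypothetical, the
number of observed order statistics is taken as the length of the supplied list, and $v_i$ is
taken to be $(\xi_{i-1}/\theta)e^{-\xi_{i-1}/\theta} - (\xi_i/\theta)e^{-\xi_i/\theta}$, the scale-free derivative that
makes the formula agree with Example 1 of Section 3.3.3 in the full sample case.

```python
import math

def censored_exponential_mle(observed, n):
    """Epstein-Sobel MLE of theta from the r smallest of n lifetimes (Type II censoring)."""
    xs = sorted(observed)
    r = len(xs)
    # Observed lifetimes plus the n - r censored units, each credited with X_(r).
    return (sum(xs) + (n - r) * xs[-1]) / r

def wald_statistic_censored(counts, boundaries, theta, n):
    """R_n for cells (0, xi_1], ..., (xi_{M-1}, infinity), evaluated at theta."""
    xi = [0.0] + list(boundaries) + [math.inf]
    surv = [math.exp(-b / theta) for b in xi]            # exp(-xi_i/theta), 0 at infinity
    # (xi_i/theta) * exp(-xi_i/theta), taken as 0 in the limit xi_i -> infinity
    g = [(b / theta) * s if s > 0.0 else 0.0 for b, s in zip(xi, surv)]
    M = len(counts)
    p = [surv[i - 1] - surv[i] for i in range(1, M + 1)]
    v = [g[i - 1] - g[i] for i in range(1, M + 1)]
    pearson = sum((c - n * q) ** 2 / (n * q) for c, q in zip(counts, p))
    D = 1.0 - surv[M - 1] - sum(x * x / q for x, q in zip(v, p))
    score = sum(c * x / q for c, x, q in zip(counts, v, p))
    return pearson + score ** 2 / (n * D)

# Toy illustration with made-up lifetimes: 8 of n = 10 units observed.
theta_hat = censored_exponential_mle([1, 2, 3, 4, 5, 6, 7, 8], n=10)
print(theta_hat)
```

Here `boundaries` holds the $M-1$ finite cell boundaries $\xi_1,\ldots,\xi_{M-1}$, the last of which is
the censoring point $X_{([n\alpha])}$.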

The motivation for using censored data when lifetimes or survival
times are being measured is apparent from the EXP data set. The sample
80th percentile is 9.46, while the maximum of the 100 observations is
39.12. The MLE of $\theta$ from the data censored at $\alpha = 0.8$ is $\hat\theta_n = 5.471$,
compared with the full sample MLE, $\bar X = 5.415$. Experience shows that the
Roscoe-Byars guidelines are not adequate to ensure accurate critical points
from the $\chi^2$ distribution in the present situation, where the $np_i$ are random
and unequal. Tests of the EXP data will therefore be made with (a) the
full sample using 10 cells having the sample deciles as boundaries; and
(b) the data censored at $\alpha = 0.8$ using 9 cells with the first 8 sample
deciles as boundaries. All cells except the rightmost in case (b) contain
10 observations. The results are, for the full sample,

$$R_n = 6.132 + 0.220 = 6.352$$

with a P-value of 0.704 from $\chi^2(9)$. For the censored sample,

$$R_n = 5.153 + 0.065 = 5.218$$

with a P-value of 0.734 from $\chi^2(8)$. These results are comparable to
those obtained for the same data in Example 1 of Section 3.3.3.

